Genome-Wide Association Analysis of Yield-Related Traits and Candidate Genes in Vegetable Soybean

Owing to the rising demand for vegetable soybean products, there is an increasing need for high-yield soybean varieties. However, the complex correlation patterns among quantitative traits with genetic architecture pose a challenge for improving vegetable soybean through breeding. Herein, a genome-wide association study (GWAS) was applied to 6 yield-related traits in 188 vegetable soybean accessions. Using a BLINK model, a total of 116 single nucleotide polymorphisms (SNPs) were identified for plant height, pod length, pod number, pod thickness, pod width, and fresh pod weight. Furthermore, a total of 220 genes were found in the 200 kb upstream and downstream regions of significant SNPs, including 11 genes encoding functional proteins. Among them, four candidate genes, Glyma.13G109100, Glyma.03G183200, Glyma.09G102200, and Glyma.09G102300 were analyzed for significant haplotype variations and to be in LD block, which encode MYB-related transcription factor, auxin-responsive protein, F-box protein, and CYP450, respectively. The relative expression of candidate genes in V030 and V071 vegetable soybean (for the plant height, pod number, and fresh pod weight of V030 were lower than those of the V071 strains) was significantly different, and these genes could be involved in plant growth and development via various pathways. Altogether, we identified four candidate genes for pod yield and plant height from vegetable soybean germplasm. This study provides insights into the genomic basis for improving soybean and crucial genomic resources that can facilitate genome-assisted high-yielding vegetable soybean breeding.


Introduction
Soybean (Glycine max (L.) Merr.) is a globally cultivated oil-seed crop.It is also an important source of high-quality vegetable protein [1].Based on the different harvesting times, soybeans can be divided into two crop types: grain soybean and vegetable soybean (edamame).Vegetable soybeans are harvested at the R6 stage, when the pods are green and the seeds are fully developed [2].The large pods and large grains are two important visual qualities of vegetable soybeans [3,4].Therefore, pod quality has long been considered one of the most important traits in vegetable soybean breeding.Additionally, vegetable soybean yield is known to be affected by plant height, pod number, pod width, pod thickness, pod length, and fresh pod weight per plant.With the accelerated development of the global economy and population growth, being an inexpensive source of vegetable protein and oil, the demand for vegetable soybeans has sharply increased [5][6][7], but fewer reports focus on the yield of vegetable soybean compared to grain soybean at present.Therefore, it is necessary to analyze the genetic basis of soybean yield in R6 (the mature stage of the vegetable soybean) and to improve the overall yield of vegetable soybean crops.Understanding the genetic variations related to yield traits is necessary for successful plant breeding [8].Planting architecture is a key factor affecting soybeans' planting density and seed yield.The ideal soybean plant structure not only optimizes the structure of the canopy but also prevents lodging, improves photosynthetic efficiency, and achieves a higher yield [7,8].In terms of crop improvement, vegetable soybeans are assessed based on soybean yield per plant, pod number, pod length, pod width, pod thickness, plant height, and branch number.Therefore, identifying genetic loci associated with yield-related traits could contribute to the breeding of high yielding vegetable soybean.Furthermore, it could improve our understanding of the complex interaction of yield-related traits and working molecular mechanisms to improve vegetable soybean yield [9][10][11][12].
Genome-wide association studies (GWAS) are a highly effective method for dissecting the natural variation of quantitative traits based on linkage disequilibrium, which can provide a theoretical basis for analyzing the genetic structure and molecular pathways for the improvement of complex soybean traits [11,13,14].In one such study, 125 candidate selection regions and 5 potential candidate genes related to agronomic traits were predicted using GWAS [15].Zhang et al. identified 22 loci with minor effects and predicted 3 important candidate genes on chromosome 19 using the results of GWAS [16].In another study, a total of 58 SNP S significantly correlated with seed weight, internode number, and plant height, were identified using GWAS, and 28 candidate genes related to yield traits were predicted [17].In addition, 27 quantitative trait nucleotides (QTNs) associated with seed size correlations were identified using GWAS [18,19].Although the plant architecture and yield traits of soybean have been identified, mining yield trait-related genes in vegetable soybean (the mature stage of vegetable soybeans) has rarely been reported.Thus, quantitative genetics can identify genomic regions associated with yield traits, which could facilitate the breeding of improved and higher-yielding soybean cultivars.
Considering the gaps in the current research, we collected and evaluated 188 diverse vegetable soybean genotypes for 6 yield-related traits: the number of pods per plant, plant height, fresh pod weight, pod width, pod thickness, and pod length.Furthermore, GWAS was used to identify genetic loci and candidate genes for yield traits.The genetic network underlying the six traits through the LD of the linked loci with significant phenotypic correlation was further characterized.These results provide an important theoretical understanding of the genetic basis of yield traits and will facilitate the future breeding of vegetable soybean.

Phenotype Variations of Yield-Related Traits in Vegetable Soybean
To identify important yield-related candidate genes from vegetable soybean germplasm resources, we conducted whole-genome re-sequencing and GWAS on a panel of 188 vegetable soybean accessions.Firstly, six yield-related traits of vegetable soybeans were investigated in the R6 stage: plant height, fresh pod weight per plant, pod length, number of pods per plant, pod width, and pod thickness.Among them, the vegetable soybean accessions V071 (plant height: 55.45 cm; fresh pod weight per plant: 215.67 g; number of pods per plant: 80) had higher yield-related traits than V030 (plant height: 10.10 cm; fresh pod weight per plant: 29.3 g; number of pods per plant: 5).Meanwhile, it can be seen that the variation coefficients of pod length (13.22%), pod width (14.65%), and pod thickness (12.75%) were relatively small, whereas the variation coefficients of the number of pods per plant (19.41%), plant height (32.08%), and fresh pod weight per plant were relatively moderate (40.91%) (Table 1).The average values of yield-related traits for vegetable soybeans during the R6 period analyzed conform to the normal distribution of data (Figure 1).

Population Structure Analysis of Vegetable Soybean
Further genetic differentiation analysis was conducted for soybean accession tained from North and South China, and as well as from other countries.An adjacen shows that genotypes are divided into about nine different groups (Figure 2A-C).I dition, 66,825 high quality SNPs were used to evaluate population structure acros panel with K values ranging.LnP (D) values continued to increase and significant cha were observed as the K-value changed from 9 to 10 (Figure 2A,B).Thus, the most l value of K is 9, and this division of the soybean panel is consistent with a significant K value (Figure 2A,B).Furthermore, this is consistent with the adjacent phylogeneti (Figure 2C).In addition, 188 vegetable soybean accessions were assessed for linkage quilibrium (LD) using a subset of high-quality markers.At a threshold of R 2 = 0.317 average decay distance of LD was 150.00 kb for all 188 vegetable soybean accessions ure 2D).

Population Structure Analysis of Vegetable Soybean
Further genetic differentiation analysis was conducted for soybean accessions obtained from North and South China, and as well as from other countries.An adjacent tree shows that genotypes are divided into about nine different groups (Figure 2A-C).In addition, 66,825 high quality SNPs were used to evaluate population structure across the panel with K values ranging.LnP (D) values continued to increase and significant changes were observed as the K-value changed from 9 to 10 (Figure 2A,B).Thus, the most likely value of K is 9, and this division of the soybean panel is consistent with a significant delta K value (Figure 2A,B).Furthermore, this is consistent with the adjacent phylogenetic tree (Figure 2C).In addition, 188 vegetable soybean accessions were assessed for linkage disequilibrium (LD) using a subset of high-quality markers.At a threshold of R 2 = 0.317, the average decay distance of LD was 150.00 kb for all 188 vegetable soybean accessions (Figure 2D).

Yield-Related SNPs Were Identified in Vegetable Soybean via GWAS
The BLINK model was used to analyze the yield traits of 188 vegetable soybean accessions.A total of 116 significant yield-related SNPs were identified, and the p-values were less than 0.00001 (Figure 3 and Table 2).For fresh pod weight, 45 SNPs were detected and located on chromosomes 2, 5, 6, 7, 9, 10, 11, 14, 15, 16, 18, 19, and 20 (Figure 3A and Table 2); among them, three significantly related SNPs spanning a region of approximately 1.35 Mb (36.20 Mb to 37.55 Mb) were juxtaposed in the Chr10:37341548-37690972 [20].In addition, a total of 8 SNPs associated with fresh pod weight on the 6, 9, 11, and 15 chromosomes were identified.Among them, one significant SNP (ss18491673) was 0.78 Mb away from the known SNP (ss17708693) on chromosome 9.Interestingly, the ss17708693 has been reported to regulate the soybean seed weight [21].Meanwhile, 35 SNPs associated with pod number were identified and located on chromosomes 3, 7, 10, 12, 17, and 20 (Figure 3B and Table 2), and a total of 6 significantly related SNPs related to soybean yield spanning a region of approximately 8.45 Mb (31.00 Mb to 39.50 Mb) were juxtaposed on Chr03:3343094 [22].For plant height, 36 SNPs were identified and located on chromosomes 1, 5, 7, 8, 10, 12, 13, 14, 17, 19, and 20 (Figure 3C and Table 2).Among them, one SNP was located 0.96 Mb away from the AP2 domain encoding gene (Glyma.07g078600)on chromosome 7, which is a large class of plant specific transcription factor that plays an important role in many biological processes, such as growth, development, and abiotic stress responses [23,24].In addition, a total of seven significantly related SNPs spanning a region of approximately 2.5 Mb (21.00 Mb to 23.5 Mb on chromosome 13) were juxtaposed with the SWI/SNF locus (Glyma13g118200), which relates to growth and development [24,25].

Candidate Gene Analysis of Yield-Related Traits in Vegetable Soybean
According to the results of GWAS, ss39469452, ss8350061, ss18491673, ss36769025, and ss22150035 were located in the vicinity of known QTL intervals [20][21][22][23][24][25].For 5 SNPs, genes located in 200 kbp genomic regions of each significant SNP were identified as candidate genes.A total of 56 candidate genes were identified under these significant SNPs.Phytozome 13 and Dicots Plaza 5.0 was used to analyze the function of these genes.Of these genes, 11 were found to be functionally annotated genes, while the remaining genes

Yield-Related SNPs Were Identified in Vegetable Soybean via GWAS
The BLINK model was used to analyze the yield traits of 188 vegetable soybean accessions.A total of 116 significant yield-related SNPs were identified, and the p-values were less than 0.00001 (Figure 3 and Table 2).For fresh pod weight, 45 SNPs were detected and located on chromosomes 2, 5, 6, 7, 9, 10, 11, 14, 15, 16, 18, 19, and 20 (Figure 3A and Table 2); among them, three significantly related SNPs spanning a region of approximately 1.35 Mb (36.20 Mb to 37.55 Mb) were juxtaposed in the Chr10:37341548-37690972 [20].In addition, a total of 8 SNPs associated with fresh pod weight on the 6, 9, 11, and 15 chromosomes were identified.Among them, one significant SNP (ss18491673) was 0.78 Mb away from the known SNP (ss17708693) on chromosome 9.Interestingly, the ss17708693 has been reported to regulate the soybean seed weight [21].Meanwhile, 35 SNPs associated with pod number were identified and located on chromosomes 3, 7, 10, 12, 17, and 20 (Figure 3B and Table 2), and a total of 6 significantly related SNPs related to soybean yield spanning a region of approximately 8.45 Mb (31.00 Mb to 39.50 Mb) were juxtaposed on Chr03:3343094 [22].For plant height, 36 SNPs were identified and located on chromosomes 1, 5, 7, 8, 10, 12, 13, 14, 17, 19, and 20 (Figure 3C and Table 2).Among them, one SNP was located 0.96 Mb away from the AP2 domain encoding gene (Glyma.07g078600)on chromosome 7, which is a large class of plant specific transcription factor that plays an important role in many biological processes, such as growth, development, and abiotic stress responses [23,24].In addition, a total of seven significantly related SNPs spanning a region of approximately 2.5 Mb (21.00 Mb to 23.5 Mb on chromosome 13) were juxtaposed with the SWI/SNF locus (Glyma13g118200), which relates to growth and development [24,25].
Plants 2024, 13, 1442 5 of 15 were hypothetical proteins with no functional annotation.Among these functionally annotated genes, Glyma.03G183200, an auxin-responsive family protein located on chromosome 3 near the significant SNP (ss39469452) for pod number.Candidate genes located near peak SNPs for fresh pod weight were Glyma.09G102300 and Glyma.09G102200,encoding F-box protein and CYP family protein, respectively.Glyma.13G109100 was located near the significant SNP (ss22150035) on the chromosome, which encodes MYB-related transcription factors.

Candidate Gene Analysis of Yield-Related Traits in Vegetable Soybean
According to the results of GWAS, ss39469452, ss8350061, ss18491673, ss36769025, and ss22150035 were located in the vicinity of known QTL intervals [20][21][22][23][24][25].For 5 SNPs, genes located in 200 kbp genomic regions of each significant SNP were identified as candidate genes.A total of 56 candidate genes were identified under these significant SNPs.Phytozome 13 and Dicots Plaza 5.0 was used to analyze the function of these genes.Of these genes, 11 were found to be functionally annotated genes, while the remaining genes were hypothetical proteins with no functional annotation.Among these functionally annotated genes, Glyma.03G183200, an auxin-responsive family protein located on chromosome 3 near the significant SNP (ss39469452) for pod number.Candidate genes located near peak SNPs for fresh pod weight were Glyma.09G102300 and Glyma.09G102200,encoding F-box protein and CYP family protein, respectively.Glyma.13G109100 was located near the significant SNP (ss22150035) on the chromosome, which encodes MYB-related transcription factors.
To further explore the key candidate genes for vegetable soybean yield traits, a haplotype analysis of the above 11 candidate genes was performed (Figure 4).Importantly, the plant height-related candidate gene Glyma.13g109100identified has three haplotypes.Further analysis depicted that 22.34% of landraces and 34.5% of improved cultivars possessed Hap1(TA), which had a higher plant height than Hap3(AA).The number of pods per plant-related candidate gene Glyma.03g183200also had three haplotypes; interestingly, 8% of landraces and 26% of improved cultivars possessed Hap1(GTTCAG), which produced a higher number of pods per plant than Hap3(CATCCA).Simultaneously, two haplotypes were identified for the fresh pod weight-related candidate genes.Among these two genes, Glyma.09g102200 was identified as having three haplotypes: 13% of landraces and 34% of improved cultivars possessed Hap1(TCC), which had a greater fresh pod weight than did cultivars with Hap3(TCT); Glyma.09g102300 was identified as having two haplotypes: 22.8% of landraces and 50% of improved cultivars possessed Hap1(CT), which had a greater fresh pod weight per plant than did cultivars with Hap3(AA).Meanwhile, LD linkage disequilibrium analysis was performed on the above candidate genes, and it was found that the above genes were all in the LD block (Figure 5).It was further indicated that Glyma.13g109100,Glyma.03g183200,Glyma.09g102200, and Glyma.09g102300could be candidate genes to regulate the vegetable soybean plant height and fresh pod weight traits.To further explore the key candidate genes for vegetable soybean yield traits, a haplotype analysis of the above 11 candidate genes was performed (Figure 4).Importantly, the plant height-related candidate gene Glyma.13g109100identified has three haplotypes.Further analysis depicted that 22.34% of landraces and 34.5% of improved cultivars possessed Hap1(TA), which had a higher plant height than Hap3(AA).The number of pods per plant-related candidate gene Glyma.03g183200also had three haplotypes; interestingly, 8% of landraces and 26% of improved cultivars possessed Hap1(GTTCAG), which produced a higher number of pods per plant than Hap3(CATCCA).Simultaneously, two haplotypes were identified for the fresh pod weight-related candidate genes.Among these two genes, Glyma.09g102200 was identified as having three haplotypes: 13% of landraces and 34% of improved cultivars possessed Hap1(TCC), which had a greater fresh pod weight than did cultivars with Hap3(TCT); Glyma.09g102300 was identified as having two haplotypes: 22.8% of landraces and 50% of improved cultivars possessed Hap1(CT), which had a greater fresh pod weight per plant than did cultivars with Hap3(AA).Meanwhile, LD linkage disequilibrium analysis was performed on the above candidate genes, and it was found that the above genes were all in the LD block (Figure 5).It was further indicated that Glyma.13g109100,Glyma.03g183200,Glyma.09g102200, and Glyma.09g102300could be candidate genes to regulate the vegetable soybean plant height and fresh pod weight traits.   .Linkage disequilibrium analysis for the candidate genes related to vegetable soybean yield traits. (A).Linkage disequilibrium analysis for the plant height-related candidate genes (Glyma.13g109100) in the 188 vegetable soybeans.(B).Linkage disequilibrium analysis for the pod number-related candidate genes (Glyma.03g183200) in the 188 vegetable soybeans.(C, D).Linkage disequilibrium analysis for the fresh pod number-related candidate genes (Glyma.09g102200and Glyma.09g102300) in the 188 vegetable soybeans.

Different Expression Pattern of Candidate Genes in Pods and Stem
In order to confirm whether candidate genes could regulate the vegetable soybeans height, fresh pod weight, and pod number, the expression patterns of candidate genes were tested via qRT-PCR between V30 and V71 (Figure 6).The relative expression of the candidate genes of Glyma.03g138200 were significantly different between V071 and V030 at the R5, R6, and R7 stages (p < 0.05), indicating that Glyma.03g138200gene was involved in the positive regulation of seed number (Figure 6A).The relative expression of the potential candidate genes associated with fresh pod weight, Glyma.09g102200 and Glyma.09g102300,showed significant expression differences between V071 and V030 at the R5, R6, and R7 stages (p < 0.05).With the increase in the growth period, the expression level of these two genes also increased (Figure 6B,C).Therefore, we speculate that Glyma.09g102200 and Glyma.09g102300play a crucial role in the pod filling period of vegetable soybean seeds.In addition, the plant height-related candidate gene Glyma.13g109100showed significantly different expression levels between V071 and V030 at the R6 stage (p < 0.05), indicating that Glyma.13g109100gene may have been involved in growth and development and positively regulate plant height (Figure 6D).Therefore, we supposed that the candidate genes of Glyma.13g109100,Glyma.03g183200,Glyma.09g102300, and Glyma.09g10220000have the ability to regulate vegetable soybean plant height, pod number, and fresh pod weight.
Figure 5. Linkage disequilibrium analysis for the candidate genes related to vegetable soybean yield traits. (A).Linkage disequilibrium analysis for the plant height-related candidate genes (Glyma.13g109100) in the 188 vegetable soybeans.(B).Linkage disequilibrium analysis for the pod number-related candidate genes (Glyma.03g183200) in the 188 vegetable soybeans.(C, D).Linkage disequilibrium analysis for the fresh pod number-related candidate genes (Glyma.09g102200and Glyma.09g102300) in the 188 vegetable soybeans.

Different Expression Pattern of Candidate Genes in Pods and Stem
In order to confirm whether candidate genes could regulate the vegetable soybeans height, fresh pod weight, and pod number, the expression patterns of candidate genes were tested via qRT-PCR between V30 and V71 (Figure 6).The relative expression of the candidate genes of Glyma.03g138200 were significantly different between V071 and V030 at the R5, R6, and R7 stages (p < 0.05), indicating that Glyma.03g138200gene was involved in the positive regulation of seed number (Figure 6A).The relative expression of the potential candidate genes associated with fresh pod weight, Glyma.09g102200 and Glyma.09g102300,showed significant expression differences between V071 and V030 at the R5, R6, and R7 stages (p < 0.05).With the increase in the growth period, the expression level of these two genes also increased (Figure 6B,C).Therefore, we speculate that Glyma.09g102200 and Glyma.09g102300play a crucial role in the pod filling period of vegetable soybean seeds.In addition, the plant height-related candidate gene Glyma.13g109100showed significantly different expression levels between V071 and V030 at the R6 stage (p < 0.05), indicating that Glyma.13g109100gene may have been involved in growth and development and positively regulate plant height (Figure 6D).Therefore, we supposed that the candidate genes of Glyma.13g109100,Glyma.03g183200,Glyma.09g102300, and Glyma.09g10220000have the ability to regulate vegetable soybean plant height, pod number, and fresh pod weight.

Discussion
In Asia, soybeans are a widely consumed type of legume vegetable, especially pan and China.Owing to the superior nutrition, appearance, and taste of vegetabl bean, there is growing demand, globally, for the cultivation of this type of soybean the 1990s, the market for vegetable soybeans in the United States has increased, rea 10,000 tons in 2000 [29,30].However, the demand for vegetable soybeans cannot b due to the lack of high-yielding soybean varieties.China, the country of origin fo beans, has the most enormous soybean genetic resources worldwide.Thus, we ana the genetic structure of vegetable soybeans and GWAS was used to identify ben genes for molecular breeding.
The GWAS method was used to analyze and calculate the association between type and phenotype variation, and to dissect the genetic basis of important traits [3 instance, in the study of maize, the phenotypic variation of grain yield per plan tassel branch number was reported as 42.37% and 49.79%, respectively; the coeffic variation with grain width and 100-kernel weight was 11% and 19%, respectivel Additionally, plant structure, especially the number of nodes and branches, largely mines the pod number and yield of soybeans.Furthermore, other plant structures, i ing the number and size of pods, are also major components affecting the structure plant and grain yield.Therefore, it is of great significance to explore candidate gen yield-related traits in vegetable soybeans to improve soybean yield through preci netic engineering and breeding.
In this study, 188 vegetable soybean accessions were divided into 9 categories population structure analysis, which indicated some variation within the populatio In addition, during population analysis, the acceptable distance between candidate and markers was determined by LD, and the LD varied among different population The LD decay distance of the 188 vegetable soybean varieties in this study was 150 (r 2 = 0.375), which is within the previously reported range (90-574 kb) [35].In a pre study, the QTL region related to plant height was located on ss21,937,082-ss23,937,

Discussion
In Asia, soybeans are a widely consumed type of legume vegetable, especially in Japan and China.Owing to the superior nutrition, appearance, and taste of vegetable soybean, there is growing demand, globally, for the cultivation of this type of soybean.Since the 1990s, the market for vegetable soybeans in the United States has increased, reaching 10,000 tons in 2000 [29,30].However, the demand for vegetable soybeans cannot be met due to the lack of high-yielding soybean varieties.China, the country of origin for soybeans, has the most enormous soybean genetic resources worldwide.Thus, we analyzed the genetic structure of vegetable soybeans and GWAS was used to identify beneficial genes for molecular breeding.
The GWAS method was used to analyze and calculate the association between genotype and phenotype variation, and to dissect the genetic basis of important traits [31].For instance, in the study of maize, the phenotypic variation of grain yield per plants and tassel branch number was reported as 42.37% and 49.79%, respectively; the coefficient of variation with grain width and 100-kernel weight was 11% and 19%, respectively [32].Additionally, plant structure, especially the number of nodes and branches, largely determines the pod number and yield of soybeans.Furthermore, other plant structures, including the number and size of pods, are also major components affecting the structure of the plant and grain yield.Therefore, it is of great significance to explore candidate genes for yield-related traits in vegetable soybeans to improve soybean yield through precise genetic engineering and breeding.
In this study, 188 vegetable soybean accessions were divided into 9 categories using population structure analysis, which indicated some variation within the population [33].In addition, during population analysis, the acceptable distance between candidate genes and markers was determined by LD, and the LD varied among different populations [34].The LD decay distance of the 188 vegetable soybean varieties in this study was 150.00 kb (r 2 = 0.375), which is within the previously reported range (90-574 kb) [35].In a previous study, the QTL region related to plant height was located on ss21,937,082-ss23,937,081 in the Chr13 [23,24], ss22150035 was located in this interval, and Glyma.13G109100 was located on chromosome 13 near the significant SNP ss22150035 for plant height.In addition, Glyma.13G109100showed higher expression levels in the long plant height variety than in the short height variety (Figure 6), indicating that it may be involved in regulating the vegetable soybean plant height.This gene encodes MYB-related transcription factors, a highly conserved transcription factor family in eukaryotes, which is involved in many developmental processes such as root hair development, pollen formation, seed germination, flower stem strength, and yield.It also plays a role in abiotic stresses such as drought, ultraviolet light, cold stress, high temperature stress, and salt stress [36,37].Meanwhile, MYB-related transcription factors could also reduce cell size and regulate plant structure, such as internode length, petiole length, leaf area, and plant height via the brassinosteroid (BR) pathway [36].Other studies have also shown that OsMYB110 not only changes plant height but also endows rice yield by increasing the number of grains per panicle and grain size [28].Therefore, Glyma.13G109100 was predicted to be a candidate gene that regulates vegetable soybean plant height.
In a recent report, 294 seed weight-related SNPs were identified in soybean chromosomes (http://www.soybase.org/)[16,38].In addition, some identified SNP S have been identified to correlate with yield traits, such as seed size, maturity, flowering time, and plant height [39].For the seed weight, Copley et al. have shown that the SNP site of Chr09: 17708693, was important to regulate the seed weight, and was 0.78 Mb away from the SNP (ss18491673) on chromosome 9 in our study.Based on the annotation information, Glyma.09G102300 and Glyma.09G102200 were identified as a candidate gene to regulate vegetable soybean pod weight and seed yield.Among them, Glyma.09G102300mainly encodes F-box-related proteins, a family of proteins that exist widely in eukaryotes and contain the F-box domain, which plays an important role in the cell cycle, transcription, apoptosis, cell signal transduction, growth and development [40,41].In addition, the F-box proteins, OsFBX206 and OsFBK12, could regulate ethylene synthesis and boisterousness to improve grain size and yield in rice and Medicago truncatula [42][43][44].Meanwhile, we found that the NADH dehydrogenase1 alpha sub-complex was predicted to interact with Glyma.09G102300.The stability of NADH dehydrogenase1 has been demonstrated to play an important role in the regulation of seed germination, yield, and growth retardation in Arabidopsis and maize [45,46].Moreover, the expression level of Glyma.09G102300 in the V071 was higher than V030 in our study.Therefore, Glyma.09G102300 can be regarded as a key candidate gene for regulating vegetable soybean pod weight.Some subfamilies of CYP genes have been shown to influence both grain size and number [47,48].In a recent report, GmCYP78A72 was over-expressed in soybeans and proved that it could increase pod size, which is an extremely desirable trait for enhancing productivity [49].In addition, other studies have shown that TaCYP78A participates in the auxin synthesis pathway and promotes auxin accumulation and cell wall remodeling through the transcriptome and hormone metabolome analyses [50].These studies suggest that members of the CYP78A subfamily that encode cytochrome P450 generate growth signals that enhance plant fruit and seed (grain) development.Although the CYP subfamily is conserved in angiosperms, the contribution to the physiological function of the CYP72A subfamily is largely unknown.Recently, Gunupuru et al. identified that TaCYP72A had a positive effect on grain number in wheat [51].In rice, the CYP724B1 could control grain size by affecting the expression level of BR-related genes [52,53].Therefore, CYP72A also plays a positive regulatory role for grain weight.In this study, we identified a candidate gene related to the seed yield of vegetable soybean, Glyma.09G102200, which encodes CYP72A154 and belongs to the CYP72A subfamily.The expression level of Glyma.09G102200showed significant differences between the V030 and V071 vegetable soybean at the R5, R6, and R7 stages (the fresh pod weight of V030 was lower than that of the V071 accession).It is suggested that CYP72A154 may be a candidate gene to improve vegetable soybean seed weight.Therefore, Glyma.09g102300 and Glyma.09g102200 were predicted to be the potential candidate genes for regulating vegetable soybean pod weight.These results indicate that Plants 2024, 13, 1442 11 of 15 SNP-GWAS is a simple, effective, and robust technique for identifying candidate genes associated with complex yield-related traits in vegetable soybean.

Plant Material and Field Experiments
The vegetable soybean in this study consisted of 188 different genotypes, including 115 and 58 genotypes originating from South and North China, and 14 genotypes from North America, India, Brazil, and Thailand.Of the 188 accessions, 44 were landraces and 144 were improved cultivars (Table S1).Field trials were conducted at Yazhou Bay Area (109.174313E,18.352212N) in Hainan Province, China, in December of 2021 and 2022, respectively.A single-row plot randomized complete block design was used with three independent repeated planting experiments, respectively.As previously described, the soybean germplasm planting follows normal agronomic practices with some modifications [54].A total of 60 vegetable soybean seeds were sown in three rows, 3 m in length, and spaced 0.5 m apart.Five randomly chosen plants from each row were recorded for phenotypic data at the R6 growth stage.Plant height was measured from above the surface of the soil to the tip of the main stem.Likewise, pod numbers were counted for every plant.The fresh pod weight was calculated with all the pods from every vegetable soybean.In addition, 10 pods were randomly selected from each plant to measure pod length, pod width, and pod thickness.

Analysis of Vegetable Soybean Yield-Related Traits
The vegetable soybean yield traits measured in this study included plant height, pod number, fresh pod weight, per plant pod width, pod thickness, and pod length at the R6 stage.Correlation of vegetable soybean yield-related traits data were analyzed by R package (https://cran.r-project.org/web/packages/Hmisc/index.html,accessed on 5 March 2023).The standard deviation and coefficient of variation of vegetable soybean yield traits were calculated using SPSS16 software.Coefficient of variation = SD (standard deviation)/mean.

Genotypic Data
DNA was extracted from vegetable soybean leaves using CTAB (hexadecyl trimethyl ammonium bromide, C8440, Solarbio, Beijing, China) [30].At the same time, DNA libraries were prepared by the restriction enzyme ApeKI (R0643S, NEB, Ipswich, MA, USA) and sequenced [55,56].At the Shenzhen Huada Gene Research Institute, we obtained SLAF tags by digestion of each soybean accession leaf's DNA, fragment ligation, PCR amplification, and selection of target fragments for SLAF libraries.The Illumina HiSeqTM 2500 platform (Illumina, Inc., San Diego, CA, USA) was performed to generate 45 bp sequence reads at both ends of the sequencing tags from each accession library.The alignment of the obtained raw paired end reads with the reference genome (Glycine max Wm82.a2.v1) was performed using BWA software (Version: 0.6.1-r104).SAMtools 48 software was employed to convert the mapping results into the BAM format and to effectively filter out unmapped and non-unique reads.The BEDtools software (Version: 2.17.0) was utilized to calculate the sequence alignment coverage.The detection of SNPs was performed using the GATK (version 2.4-7-g5e89f01) and SAMtools software.The SNP annotation was conducted and analyzed based on the soybean genome (Glycine max Wm82.a2.v1) using the ANNOVER package.

Population Analysis
The 188 accessions of soybean were re-sequenced and approximately 10 Gb of raw data were collected from each genome.TRIMMOMATIC (v.0.36) was used to filter the adaptors and low-quality bases of the raw data, and BOWTIE 2 (v.2.4.1) were used to map clean data to the Williams82.v2.1 genome with the default parameters [57].Sentieon software (https://www.sentieon.com/)was used for the global realignment of reads and to produce VCF files.VCF tools were further employed to filter the raw VCF file with the following parameters (minDP 4, min-alleles 2, max-alleles 2, minQ 30, maxDP 60, max missing 0.9, maf 0.05), nucleotide diversity (p) (within a 500 kb window with no overlap), and Tajima's D (with a 500 kb window with no overlap) [58].Population structure analysis was performed using the admixture [59], and the best k (k = 9) was selected by STRUCTURE HARVESTER (v.2.3) [60].A phylogenetic tree was constructed using the neighbor-joining method implemented in PHYLIP and displayed in iTOL [61,62].PopLDdecay was used to analyze the population [63,64].

Genome-Wide Association Study and Candidate Gene Analysis
The phenotypic and genotypic data were used for GWAS.BLINK is an improved model version of fixed and random model cyclic probability unity (Farm CPU), which is statistically powerful and efficient in identifying significant SNPs associated with important traits [24].Thus, the BLINK model was used to draw the Manhattan and QQ plot during the process of GWAS.Meanwhile, BLINK-generated data were used in R package [65].All potential candidate genes were mined within 200 kb around the SNP sites.Gene expression and annotation analyses were performed using the Soybean Database (https://www.soybase.org/),which contains information on vegetable soybean yieldrelated traits.

Real-Time Quantitative Reverse Transcription PCR
To analyze the expression patterns of Glyma.09G102300,Glyma.09G102200,Glyma.03g183200, and Glyma.13G109100 in V071 and V030, the pod and main stem at the R5, R6, and R7 periods were harvested, and RNA was extracted by using RNAiso plus (AJ31085A, TaKaRa, Japan).Total RNA (1 ug) was reverse transcribed into cDNA by Superscript IV reverse transcriptase (Thermo Fisher, Waltham, MA, USA).Real-time PCR amplification was performed in a 20 µL reaction: 1 µL cDNA, 0.2 mM of forward primer/reverse primer in 0.5 µL, 10 µL SYBR Green PCR Master Mix (Takara, Japan), and 8 µL ddH 2 O.The quantitative real-time PCR reaction conditions are as follows: 95 • C for 5 min, followed by 40 cycles at 95 • C for 10 s, and 62 • C for 25 s.The GmEF1A gene was used as an internal control, and the relative expression data were calculated using the method of 2 −∆∆CT .The primers used are listed in Table S2.

Conclusions
In this study, we genotyped 188 soybean accessions and evaluated their yield-related traits.We identified 116 significant SNPs associated with 3 traits.Among them, five SNPs were close to the known QTL interval for the soybean plant height, pod number, and pod weight traits.Based on gene annotation and expression analyses, Glyma.09G102300,Glyma.09G102200,Glyma.03g183200, and Glyma.13G109100 were identified as candidate genes related to vegetable soybean pod number, fresh pod weight, and plant height, respectively.The results suggest that yield-related traits are controlled by a complicated regulatory network and high-yield soybean genotypes may have different combinations of genetic networks.This study has provided a better understanding of the genetic architecture and could help to design high yielding vegetable soybean varieties through marker-assisted breeding, and by using forward and reverse genetic approaches.

Figure 2 .
Figure 2. Population structure and linkage disequilibrium analysis of 188 vegetable soybeans.(A).Cross validation error rate for 188 samples based on clustering.(B).Bar plot divides the population of 188 vegetable soybeans into 10 cluster and every color represents a cluster.(C).Evolutionary tree of 188 soybean germplasm.(D).LD analysis for the 188 vegetable soybean germplasm.

Figure 2 .
Figure 2. Population structure and linkage disequilibrium analysis of 188 vegetable soybeans.(A).Cross validation error rate for 188 samples based on clustering.(B).Bar plot divides the population of 188 vegetable soybeans into 10 cluster and every color represents a cluster.(C).Evolutionary tree of 188 soybean germplasm.(D).LD analysis for the 188 vegetable soybean germplasm.

Figure 3 .
Figure 3. Genome-wide association analysis Manhattan plots and QQ plots of 188 vegetable soybean accession for the fresh pod weight (A), number of pods (B), and plant height (C).Figure 3. Genome-wide association analysis Manhattan plots and QQ plots of 188 vegetable soybean accession for the fresh pod weight (A), number of pods (B), and plant height (C).

Figure 3 .
Figure 3. Genome-wide association analysis Manhattan plots and QQ plots of 188 vegetable soybean accession for the fresh pod weight (A), number of pods (B), and plant height (C).Figure 3. Genome-wide association analysis Manhattan plots and QQ plots of 188 vegetable soybean accession for the fresh pod weight (A), number of pods (B), and plant height (C).

Figure 4 .
Figure 4. Haplotype analysis for the yield-related traits and candidate genes in 188 vegetable soybeans.(A).Haplotype analysis for the plant height and candidate genes of vegetable soybean (p < 0.05); (B).Haplotype analysis for the pod number and candidate genes of vegetable soybean (p < 0.05); (C,D).Haplotype analysis for the fresh pod weight and candidate genes of vegetable soybean (p < 0.05).

Figure 4 .
Figure 4. Haplotype analysis for the yield-related traits and candidate genes in 188 vegetable soybeans.(A).Haplotype analysis for the plant height and candidate genes of vegetable soybean (p < 0.05); (B).Haplotype analysis for the pod number and candidate genes of vegetable soybean (p < 0.05); (C,D).Haplotype analysis for the fresh pod weight and candidate genes of vegetable soybean (p < 0.05).

Figure 5
Figure5.Linkage disequilibrium analysis for the candidate genes related to vegetable soybean yield traits. (A).Linkage disequilibrium analysis for the plant height-related candidate genes (Glyma.13g109100) in the 188 vegetable soybeans.(B).Linkage disequilibrium analysis for the pod number-related candidate genes (Glyma.03g183200) in the 188 vegetable soybeans.(C, D).Linkage disequilibrium analysis for the fresh pod number-related candidate genes (Glyma.09g102200and Glyma.09g102300) in the 188 vegetable soybeans.

Figure 6 .
Figure 6.Expression analysis of potential candidate genes in V030 and V071 at three growth opmental stages.(A).Relative expression analysis of vegetable soybean pod number-related date gene of Glyma.03g183200.(B,C).Relative expression analysis for the vegetable soybea pod weight-related candidate gene of Glyma09g102200 and Glyma09g102300.(D).Relative e sion analysis for the plant height-related candidate genes of Glyma.13g109100(* indicate p <

Figure 6 .
Figure 6.Expression analysis of potential candidate genes in V030 and V071 at three growth developmental stages.(A).Relative expression analysis of vegetable soybean pod number-related candidate gene of Glyma.03g183200.(B,C).Relative expression analysis for the vegetable soybean fresh pod weight-related candidate gene of Glyma09g102200 and Glyma09g102300.(D).Relative expression analysis for the plant height-related candidate genes of Glyma.13g109100(* indicate p < 0.05).

Table 1 .
Statistics and difference analysis of yield-related traits in the 188 vegetable soybean accessions.

Table 1 .
Statistics and difference analysis of yield-related traits in the 188 vegetable soybean sions.

Table 2 .
List of significant SNPs associated with vegetable soybean yield-related traits and candidate genes.